Development of Speech Corpora for Different Speech Recognition Tasks in Malayalam Language
Abstract
A speech corpus is the backbone of an automatic speech recognition system. This paper presents the development of speech corpora for different speech recognition tasks in the Malayalam language. A pronunciation dictionary and a transcription file, the other two essential resources for building a speech recognizer, are also created. The recognition performance for the different speech recognition tasks is presented. A speech corpus of about 18 hours has been collected for the different speech recognition tasks.
Keywords: speech recognition, corpus development, Malayalam
Similar resources
Influence of Native and Non-Native Multitalker Babble on Speech Recognition in Noise
The aim of the study was to assess speech recognition in noise using multitalker babble in native and non-native languages at two different signal-to-noise ratios. Speech recognition in noise was assessed in 60 participants (18 to 30 years) with normal hearing sensitivity, having Malayalam and Kannada as their native language. For this purpose, 6 and 10 multitalker babble were generated in K...
P65: Speech Recognition Based on Brain Signals by the Quantum Support Vector Machine for Inflammatory Patient ALS
People communicate with each other by exchanging verbal and visual expressions. However, paralyzed patients with various neurological diseases such as amyotrophic lateral sclerosis and cerebral ischemia have difficulties in daily communications because they cannot control their body voluntarily. In this context, brain-computer interface (BCI) has been studied as a tool of communication for thes...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Issues in Design and Collection of Large Telephone Speech Corpus for Slovenian Language
In this paper, different issues in the design, collection and evaluation of a large-vocabulary telephone speech corpus of the Slovenian language are discussed. The database is composed of three text corpora containing 1530 different sentences. It contains read speech from 82 speakers, where each speaker read on average more than 200 sentences and 21 speakers also read a text passage of 90 sentences. T...
Wavelet Energy based Statistical Learning Approaches to Vocoid Consonant Recognition
State-of-the-art Automatic Speech Recognition (ASR) employs rigorous experimental evaluations on large, standard corpora from the real world. In recent years ASR and Machine Learning (ML) algorithms have had a great deal of influence on each other, and feature selection can be considered an essential task in ML. Compared with traditional basic speech feature extraction techniques, Wav...